73 research outputs found

    Learning pseudo-Boolean k-DNF and Submodular Functions

    Full text link
    We prove that any submodular function f: {0,1}^n -> {0,1,...,k} can be represented as a pseudo-Boolean 2k-DNF formula. Pseudo-Boolean DNFs are a natural generalization of DNF representation for functions with integer range. Each term in such a formula has an associated integral constant. We show that an analog of Hastad's switching lemma holds for pseudo-Boolean k-DNFs if all constants associated with the terms of the formula are bounded. This allows us to generalize Mansour's PAC-learning algorithm for k-DNFs to pseudo-Boolean k-DNFs, and hence gives a PAC-learning algorithm with membership queries under the uniform distribution for submodular functions of the form f:{0,1}^n -> {0,1,...,k}. Our algorithm runs in time polynomial in n, k^{O(k \log k / \epsilon)}, 1/\epsilon and log(1/\delta) and works even in the agnostic setting. The line of previous work on learning submodular functions [Balcan, Harvey (STOC '11), Gupta, Hardt, Roth, Ullman (STOC '11), Cheraghchi, Klivans, Kothari, Lee (SODA '12)] implies only n^{O(k)} query complexity for learning submodular functions in this setting, for fixed epsilon and delta. Our learning algorithm implies a property tester for submodularity of functions f:{0,1}^n -> {0, ..., k} with query complexity polynomial in n for k=O((\log n/ \loglog n)^{1/2}) and constant proximity parameter \epsilon

    Monotonicity testing

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (p. 41-42).by Sofya Raskhodnikova.S.M

    Brief Announcement: Erasure-Resilience Versus Tolerance to Errors

    Get PDF
    We describe work in progress on providing a separation between erasure-resilient and tolerant property testing. Specifically, we are able to exhibit a property which is testable (with the number of queries independent of the length of the input) in the presence of erasures, but is not testable tolerantly

    Tolerant Testers of Image Properties

    Get PDF
    We initiate a systematic study of tolerant testers of image properties or, equivalently, algorithms that approximate the distance from a given image to the desired property (that is, the smallest fraction of pixels that need to change in the image to ensure that the image satisfies the desired property). Image processing is a particularly compelling area of applications for sublinear-time algorithms and, specifically, property testing. However, for testing algorithms to reach their full potential in image processing, they have to be tolerant, which allows them to be resilient to noise. Prior to this work, only one tolerant testing algorithm for an image property (image partitioning) has been published. We design efficient approximation algorithms for the following fundamental questions: What fraction of pixels have to be changed in an image so that it becomes a half-plane? a representation of a convex object? a representation of a connected object? More precisely, our algorithms approximate the distance to three basic properties (being a half-plane, convexity, and connectedness) within a small additive error epsilon, after reading a number of pixels polynomial in 1/epsilon and independent of the size of the image. The running time of the testers for half-plane and convexity is also polynomial in 1/epsilon. Tolerant testers for these three properties were not investigated previously. For convexity and connectedness, even the existence of distance approximation algorithms with query complexity independent of the input size is not implied by previous work. (It does not follow from the VC-dimension bounds, since VC dimension of convexity and connectedness, even in two dimensions, depends on the input size. It also does not follow from the existence of non-tolerant testers.) Our algorithms require very simple access to the input: uniform random samples for the half-plane property and convexity, and samples from uniformly random blocks for connectedness. However, the analysis of the algorithms, especially for convexity, requires many geometric and combinatorial insights. For example, in the analysis of the algorithm for convexity, we define a set of reference polygons P_{epsilon} such that (1) every convex image has a nearby polygon in P_{epsilon} and (2) one can use dynamic programming to quickly compute the smallest empirical distance to a polygon in P_{epsilon}. This construction might be of independent interest

    Property testing : theory and applications

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.Includes bibliographical references (p. 107-111).(cont.) We show upper and lower bounds for the general problem and for specific partial orders. A few of our intermediate results are of independent interest. 1. If strings with a property form a vector space, adaptive 2-sided error tests for the property have no more power than non-adaptive 1-sided error tests. 2. Random LDPC codes with linear distance and constant rate are not locally testable. 3. There exist graphs with many edge-disjoint induced matchings of linear size. In the final part of the thesis, we initiate an investigation of property testing as applied to images. We study visual properties of discretized images represented by n x n matrices of binary pixel values. We obtain algorithms with the number of queries independent of n for several basic properties: being a half-plane, connectedness and convexity.Property testers are algorithms that distinguish inputs with a given property from those that are far from satisfying the property. Far means that many characters of the input must be changed before the property arises in it. Property testing was introduced by Rubinfeld and Sudan in the context of linearity testing and first studied in a variety of other contexts by Goldreich, Goldwasser and Ron. The query complexity of a property tester is the number of input characters it reads. This thesis is a detailed investigation of properties that are and are not testable with sublinear query complexity. We begin by characterizing properties of strings over the binary alphabet in terms of their formula complexity. Every such property can be represented by a CNF formula. We show that properties of n-bit strings defined by 2CNF formulas are testable with O([square root of]n) queries, whereas there are 3CNF formulas for which the corresponding properties require Q(n) queries, even for adaptive tests. We show that testing properties defined by 2CNF formulas is equivalent, with respect to the number of required queries, to several other function and graph testing problems. These problems include: testing whether Boolean functions over general partial orders are close to monotone, testing whether a set of vertices is close to one that is a vertex cover of a specific graph, and testing whether a set of vertices is close to a clique. Testing properties that are defined in terms of monotonicity has been extensively investigated in the context of the monotonicity of a sequence of integers and the monotonicity of a function over the m-dimensional hypercube (1,... , a)m. We study the query complexity of monotonicity testing of both Boolean and integer functions over general partial orders.by Sofya Raskhodnikova.Ph.D

    The Power and Limitations of Uniform Samples in Testing Properties of Figures

    Get PDF
    We investigate testing of properties of 2-dimensional figures that consist of a black object on a white background. Given a parameter epsilon in (0,1/2), a tester for a specified property has to accept with probability at least 2/3 if the input figure satisfies the property and reject with probability at least 2/3 if it does not. In general, property testers can query the color of any point in the input figure. We study the power of testers that get access only to uniform samples from the input figure. We show that for the property of being a half-plane, the uniform testers are as powerful as general testers: they require only O(1/epsilon) samples. In contrast, we prove that convexity can be tested with O(1/epsilon) queries by testers that can make queries of their choice while uniform testers for this property require Omega(1/epsilon^{5/4}) samples. Previously, the fastest known tester for convexity needed Theta(1/epsilon^{4/3}) queries

    Erasures vs. Errors in Local Decoding and Property Testing

    Get PDF
    We initiate the study of the role of erasures in local decoding and use our understanding to prove a separation between erasure-resilient and tolerant property testing. Local decoding in the presence of errors has been extensively studied, but has not been considered explicitly in the presence of erasures. Motivated by applications in property testing, we begin our investigation with local list decoding in the presence of erasures. We prove an analog of a famous result of Goldreich and Levin on local list decodability of the Hadamard code. Specifically, we show that the Hadamard code is locally list decodable in the presence of a constant fraction of erasures, arbitrary close to 1, with list sizes and query complexity better than in the Goldreich-Levin theorem. We use this result to exhibit a property which is testable with a number of queries independent of the length of the input in the presence of erasures, but requires a number of queries that depends on the input length, n, for tolerant testing. We further study approximate locally list decodable codes that work against erasures and use them to strengthen our separation by constructing a property which is testable with a constant number of queries in the presence of erasures, but requires n^{Omega(1)} queries for tolerant testing. Next, we study the general relationship between local decoding in the presence of errors and in the presence of erasures. We observe that every locally (uniquely or list) decodable code that works in the presence of errors also works in the presence of twice as many erasures (with the same parameters up to constant factors). We show that there is also an implication in the other direction for locally decodable codes (with unique decoding): specifically, that the existence of a locally decodable code that works in the presence of erasures implies the existence of a locally decodable code that works in the presence of errors and has related parameters. However, it remains open whether there is an implication in the other direction for locally list decodable codes. We relate this question to other open questions in local decoding

    Sublinear-Time Computation in the Presence of Online Erasures

    Get PDF
    We initiate the study of sublinear-time algorithms that access their input via an online adversarial erasure oracle. After answering each query to the input object, such an oracle can erase tt input values. Our goal is to understand the complexity of basic computational tasks in extremely adversarial situations, where the algorithm's access to data is blocked during the execution of the algorithm in response to its actions. Specifically, we focus on property testing in the model with online erasures. We show that two fundamental properties of functions, linearity and quadraticity, can be tested for constant tt with asymptotically the same complexity as in the standard property testing model. For linearity testing, we prove tight bounds in terms of tt, showing that the query complexity is Θ(logt)\Theta(\log t). In contrast to linearity and quadraticity, some other properties, including sortedness and the Lipschitz property of sequences, cannot be tested at all, even for t=1t=1. Our investigation leads to a deeper understanding of the structure of violations of linearity and other widely studied properties. We also consider implications of our results for algorithms that are resilient to online adversarial corruptions instead of erasures
    corecore